Abstract


Recent advances in imaging technology make it possible to obtain imagery data 
of the Earth at high spatial, spectral and radiometric resolutions from Earth orbiting 
satellites. The rate at which the data is collected from these satellites can far exceed 
the channel capacity of the data downlink. Reducing the data rate to within the chan- 
nel capacity can often require painful trade-offs in which certain scientific returns are 
sacrificed for the sake of others. In this paper we model the radiometric version of 
this form of lossy compression by dropping a specified number of least significant bits 
from each data pixel and compressing the remaining bits using an appropriate lossless 
compression technique. We call this approach “truncation followed by lossless compression”
or TLLC. We compare the TLLC approach with applying a lossy compression
technique to the data for reducing the data rate to the channel capacity, and demonstrate
that each of three different lossy compression techniques (JPEG/DCT, VQ and
Model-Based VQ) gives a better effective radiometric resolution than TLLC for a
given channel rate.


1 Introduction 


The imaging sensors onboard satellites are capable of scanning the Earth at very high spatial, 
spectral and radiometric resolutions. Downlink channel capacity is often a major limiting 
factor for the resolution at which the data is collected. Image compression techniques can be 
used to reduce the data rate from the imaging sensor to within the downlink channel capacity. 
Ideally, decompression of the downlinked data should result in the full lossless recovery of
the image data as sensed onboard the satellite. However, the amount of compression possible
from lossless techniques is bounded by the entropy of the source. This entropy bound limits
the achievable compression ratio to the range of 2 to 3 for most NASA image
data sources, which is most often insufficient to reduce the sensor data rate to within the
channel capacity.

Large amounts of compression can, instead, be obtained with lossy compression tech- 
niques. In fact, a crude form of lossy compression is most often used in these cases, i.e. 
the temporal, spatial, spectral, and/or radiometric resolutions are limited to produce a data 
rate that can be handled by the channel capacity. Establishing these limits often requires 
painful trade-offs in which certain scientific returns are sacrificed for the sake of others. In 
this paper we model the radiometric version of this form of lossy compression by truncating 
a specified number of least significant bits followed by lossless compression of the remaining 
higher order bits. We call this approach “Truncation followed by Lossless Compression” 
(TLLC). Using the TLLC approach, the data rate can be set to within the channel capacity 
by selecting the appropriate number of least significant bits dropped. We have found that 
this method produces reasonable rate distortion values for compression ratios less than 5 or 
6. However, for larger compression ratios, the rate distortions increase exponentially as the 
amount of truncation increases. 

Much better rate distortion behavior can be obtained by using other lossy compression 
approaches. For the lossy compression approaches we have studied, the rate distortion performance
is either linear or sublinear. These lossy compression approaches are the JPEG/DCT
(Joint Photographic Experts Group/Discrete Cosine Transform [1]), VQ (Vector Quantization
[2]), and the more recently developed MVQ (Model-based VQ [3]) approach. For a given
data rate, this improved distortion behavior over TLLC can be looked upon as a gain in 
radiometric resolution. 

We first describe the TLLC approach in more detail, and give summary descriptions of 
the JPEG/DCT, VQ and MVQ lossy compression approaches. We then derive our measure 
of gain in radiometric resolution of a particular lossy compression approach over TLLC. 
Finally, we demonstrate the gain in radiometric resolution provided by the JPEG/DCT,
VQ and MVQ approaches over the TLLC approach with imagery data from three remote
sensing instruments: the Landsat Thematic Mapper (TM), the Advanced Solid-state Array 
Spectroradiometer (ASAS), and the Advanced Very High Resolution Radiometer (AVHRR). 
Of these, TM imagery data is at 8-bit resolution, while imagery data from the other two are 
at 12-bit pixel resolution with at most 10 significant bits. 

2 Lossy Image Compression Techniques


Lossy compression can produce relatively high compression ratios or low data rates (bit 
rates) at a cost of losing some information. Here we define the compression ratio (CR) to be 
the ratio of the number of bits in the original image to the number of bits in the compressed 
image. The bit rate in bits/pixel can be represented as n/CR, where n is the radiometric 
resolution (in bits/pixel) of the original image. A common measure of information loss or 
distortion is the mean squared error between the original image and the image reconstructed 
from the compressed data. The mean squared error is defined formally as 

$$\mathrm{MSE} = \frac{1}{N} \sum_{k=0}^{N-1} \left( f(k) - \hat{f}(k) \right)^2 \qquad (1)$$

where $f(k)$ and $\hat{f}(k)$ are the k-th pixels from the original and reconstructed images, respectively,
and N is the number of pixels in the image. The performance of a lossy compression
technique can be characterized by a rate-distortion curve, which is simply a plot of bit rate 
(n/CR) versus distortion (MSE). 
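
As a small illustration of these definitions, the following Python sketch computes the MSE, the compression ratio and the bit rate. The function names and the four-pixel sample values are hypothetical and are not taken from our experiments.

```python
def mean_squared_error(original, reconstructed):
    """Mean squared error between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    n_pixels = len(original)
    return sum((f - g) ** 2 for f, g in zip(original, reconstructed)) / n_pixels


def compression_ratio(original_bits, compressed_bits):
    """CR = bits in the original image / bits in the compressed image."""
    return original_bits / compressed_bits


def bit_rate(n_bits_per_pixel, cr):
    """Bit rate in bits/pixel of an n-bit image compressed by a factor cr."""
    return n_bits_per_pixel / cr


# Hypothetical four-pixel image and its reconstruction.
original = [10, 12, 15, 20]
reconstructed = [10, 13, 14, 22]
print(mean_squared_error(original, reconstructed))  # (0 + 1 + 1 + 4) / 4 = 1.5
print(bit_rate(8, 10.0))                            # 0.8 bits/pixel
```

One point on a rate-distortion curve is then the pair (bit_rate, mean_squared_error) measured for one setting of the compression technique.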

In the following subsections we describe the TLLC approach and other lossy compression 
techniques that we have used in our tests. 


2.1 Truncation followed by Lossless Compression (TLLC) 


Truncation followed by Lossless Compression (TLLC) is not a compression approach that 
one would use directly. However, as mentioned in the introduction, it is a model for the 
design practice of setting the radiometric resolution to a lower value than sensor technology 
would allow, so as to keep the data rate produced by the sensor within the limits of channel 
capacity for bringing the data from the sensor to Earth. 

Let the radiometric resolution of the image data collected at the instrument be n bits/pixel 
and the channel capacity be m bits/pixel (m < n). The TLLC approach reduces the bit rate 
from n to no more than m by dropping a number of lower order bits b. Here b is chosen such 
that the lossless compression of remaining n-b bits results in an output bit rate of no more 
than m bits/pixel. The lossless compression approach that consistently performed best in 
the cases we tested utilizes the coding model for lossless encoding specified in the JPEG still 
image compression standard [1] combined with the Witten-Neal-Cleary version of arithmetic 
coding [9]. 
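
The selection of b can be sketched as follows. This is only an illustration under stated assumptions: zlib is used as a stand-in lossless coder (it is not the JPEG lossless model with arithmetic coding used in our tests), the function names are hypothetical, and the test image is synthetic.

```python
import random
import zlib


def truncate(pixels, b):
    """Drop the b least significant bits of every pixel."""
    return [p >> b for p in pixels]


def reconstruct(truncated, b):
    """Restore the original scale; the dropped bits come back as zeros."""
    return [t << b for t in truncated]


def lossless_rate(values):
    """Bits/pixel after zlib, a stand-in for the paper's lossless coder."""
    raw = b"".join(v.to_bytes(2, "big") for v in values)
    return 8 * len(zlib.compress(raw, 9)) / len(values)


def tllc(pixels, n, m):
    """Return (b, rate) for the smallest b whose coded rate is <= m bits/pixel."""
    for b in range(n):
        rate = lossless_rate(truncate(pixels, b))
        if rate <= m:
            return b, rate
    return n, 0.0  # every bit dropped: nothing is left to send


# Synthetic 8-bit data: a constant level plus bounded noise (assumption).
rng = random.Random(1)
image = [128 + rng.randint(-20, 20) for _ in range(4096)]
b, rate = tllc(image, n=8, m=3.0)
print(b, round(rate, 2))
```

With this reconstruction rule the pixel error lies between 0 and 2^b - 1, which is the error model used in Section 3.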

2.2 JPEG/DCT


The JPEG/DCT lossy compression algorithm [1] consists of three successive stages: Discrete
Cosine Transform (DCT) transformation, coefficient quantization and lossless compression.
The original image is partitioned into nonoverlapping 8x8 pixel blocks. Each block is
independently transformed using the DCT. The DCT coefficients are then quantized using a
quantization table that is designed using the Human Visual System (HVS) contrast sensitivity
function. The first coefficient of the DCT is the DC coefficient, which is proportional to
the average brightness of the block. The quantized DC coefficients of successive blocks are
coded with DPCM (Differential Pulse Code Modulation) using 1-D causal
prediction. The quantized AC coefficients are zig-zag scanned to convert the 2-D array into a 1-D
array and are then losslessly compressed using a Huffman table that is transmitted to the
decoder as a part of the header information.

The baseline JPEG/DCT does not include standards for pixel resolutions higher than
8 bits. Since some of the images tested here have 12-bit resolution, we truncated the image
pixels such that the pixel resolution after truncation was 8 bits. After JPEG/DCT compression
was applied and the image was reconstructed from the compressed data, each pixel
value was multiplied by the truncation scale factor to scale the pixel values properly for
MSE measurements.

Spectral correlations are not easy to exploit in JPEG/DCT, as there are no standards for
decorrelating the bands of multispectral image data. (JPEG/DCT does, however, allow red,
green and blue decorrelation by converting them to luminance and chrominance components
([1], pp. 18-20, p. 503).) Therefore, we compressed each band of the multispectral images
independently in our tests.


2.3 Vector Quantization 


Vector Quantization (VQ) is the vector extension of scalar quantization and has been found to be
very useful for multispectral image compression ([4], [5]). The VQ vectors are obtained from
image data by systematically extracting nonoverlapping blocks (typically 4x4) and arranging
the pixels in each block in raster scan order. Such vectors allow VQ to exploit two-dimensional
correlations in the image data. If the image is multispectral, nonoverlapping cubes
(typically 4x2x3) may be used. VQ builds up a dictionary of a few representative vectors,
called codevectors, and then codes the image with the index value of the closest codevector
from the dictionary, called the codebook, in place of each vector. Each codevector
is represented by an address containing $\log_2 M$ bits, where M is the number of codevectors in
the codebook. Assume vectors of size k are drawn from the input image and matched with
those in the codebook. Using the indices of the matched codevectors to represent the input
image vectors results in a decreased rate of $(\log_2 M)/k$ bits/pixel, or a compression ratio of
$(k \cdot n)/\log_2 M$, where n is the radiometric resolution of the image. In all practical situations
the codebook size, M, is much smaller than the number of vectors that make up the input
image.
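
The rate and compression ratio expressions above, together with the closest-codevector search, can be illustrated with a short Python sketch. The codebook size M = 256, the vector size k = 16 and the tiny two-element codebook are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def vq_rate(codebook_size, vector_size):
    """Bits/pixel when each k-pixel vector is replaced by a log2(M)-bit index."""
    return math.log2(codebook_size) / vector_size


def vq_compression_ratio(codebook_size, vector_size, n_bits):
    """CR = (k * n) / log2(M) for n-bit pixels."""
    return vector_size * n_bits / math.log2(codebook_size)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codevector(vector, codebook):
    """Index of the codevector with the smallest squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


print(vq_rate(256, 16))                  # 8 / 16 = 0.5 bits/pixel
print(vq_compression_ratio(256, 16, 8))  # 16 * 8 / 8 = 16.0
print(nearest_codevector([1, 1], [[0, 0], [2, 2], [1, 2]]))  # 2
```

The exhaustive search in nearest_codevector costs one distance evaluation per codevector, which is the reason the encoding cost grows with the codebook size.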

The most important phase of VQ is the training process, in which an optimal codebook (by
some criterion such as least MSE) is learned from the input samples. The most widely used
algorithm is the Linde-Buzo-Gray (LBG) algorithm ([6]). Both the training and coding phases of
VQ require finding the codevector that is the closest match to a given vector. Computing this
closest match requires computations proportional to the size of the codebook. The computational
cost can be reduced by employing suboptimal approaches such as Tree Search Vector
Quantization (TSVQ) and Pruned Tree VQ (PTVQ) ([7]). The computational problems
can also be solved by using special architectures ([4]). While the codebook training and
data encoding steps of VQ are computationally intensive, the decoding step is not, because
it is a table lookup process that can be performed quickly on conventional sequential
computers. Obvious drawbacks of VQ are the computationally intensive training process for
generating codebooks for a given class of images and the maintenance of these codebooks
at the coding and decoding ends. At the encoding end a codebook has to be selected for the
given data, and a pointer to this codebook may be provided as a part of the header record in
the compressed file so that the decoder can use the same codebook for decoding. This is
one practical difficulty of using VQ for image compression. This problem is solved with the
Model-based Vector Quantization (MVQ) approach, described in the next section, in which
codebooks are generated using statistical models and the input image covariance matrix.
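
To make the training step concrete, the sketch below shows the generalized Lloyd iteration that lies at the core of LBG training: partition the training vectors by nearest codevector, then replace each codevector by the centroid of its cell. It is a simplified illustration, not the procedure used in our tests; in particular, the splitting initialization of LBG is replaced here by random sampling of the training set, and all names and data are hypothetical.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(training, codebook_size, iterations=20, seed=0):
    """Lloyd iterations started from randomly sampled training vectors."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(training, codebook_size)]
    for _ in range(iterations):
        # Partition: assign every training vector to its nearest codevector.
        cells = [[] for _ in codebook]
        for v in training:
            nearest = min(range(len(codebook)),
                          key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        # Centroid update: an empty cell keeps its previous codevector.
        for i, cell in enumerate(cells):
            if cell:
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def average_distortion(training, codebook):
    """Mean squared distance from each vector to its nearest codevector."""
    return sum(min(sq_dist(v, c) for c in codebook)
               for v in training) / len(training)


# Two well separated clusters of 2-element vectors (synthetic data).
training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook = train_codebook(training, codebook_size=2)
print(average_distortion(training, codebook))
```

Every iteration visits all training vectors and all codevectors, which illustrates why training is the expensive phase while decoding remains a table lookup.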


2.4 MVQ 


In MVQ, the codebook is generated using a statistical model of the mean-removed residual
of the vectors. The mean-removed vector elements are characterized by either Gaussian or
Laplacian error models. For small vector sizes of 2 or 4, the mean-removed vector elements
can be simulated by a uniform random number generator producing independent and identically
distributed (i.i.d.) random numbers and then passing them through a Laplacian filter
with mean $\lambda$. This is a reasonable model for generating mean-removed residuals for these
small vector sizes. However, as the vector size increases, the mean-removed vector elements
cannot be treated as independent, and so the covariance structure of the source is imposed on
the Laplacian i.i.d. process. For k-element vectors, the covariance matrix, $\Sigma$, of the input image
is a k x k matrix. The diagonal elements of $\Sigma$ are approximately equal and correspond to the
variance of the normalized pixel values in the image. The square root of $\Sigma_{00}$ (= $\lambda$) is used
to generate i.i.d. Laplacian random variables. The
consecutive Laplacian i.i.d. random numbers are grouped into vectors of size k = (k1 x k2) to
form a vector $W_i$ (the i-th vector). The covariance matrix, $\Sigma$, of the source is then factorized into
L and U, where L and U are lower and upper triangular matrices, respectively. The factorization
is performed using the Cholesky decomposition algorithm. When the Laplacian
vectors are multiplied by L, the resulting vectors have the covariance structure given by
$\Sigma$. The vectors thus generated are independent of one another; however, the elements within
each vector have the correlations given by $\Sigma$. With $W_i$ the k-element vector generated by the
Laplacian i.i.d. process and L the lower triangular matrix obtained by Cholesky
decomposition of $\Sigma$, the codevector (the i-th codebook entry) is given by

$$X_i = L W_i$$

These vectors are used as the codevectors for the source mean-removed residual vectors.
In the second pass the input image is coded using the model codebook. The codebook is
completely specified by the seed of the uniform random number generator, $\lambda$, and the lower
triangular matrix, L. The lower triangular matrix has at most $(k^2 + k)/2$ nonzero real
numbers, where k is the size of the vector. Thus, by transmitting the seed of the uniform
random number generator, $\lambda$, and L in the header of the coded file, the decoder can generate
the codebook to decode the VQ coded image.
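
A minimal Python sketch of this codebook generation is given below. It is an illustration under assumptions rather than the MVQ implementation: the Laplacian samples are drawn with unit variance by the inverse-CDF method so that the product of L with each sample vector has the covariance of the source, the scaling role of the Laplacian parameter is not modeled, and the function names, seed and 2x2 covariance matrix are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Lower triangular L with L * L^T = sigma (sigma positive definite)."""
    k = len(sigma)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(L[i][p] * L[j][p] for p in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def laplacian_sample(rng):
    """Unit-variance Laplacian sample from one uniform draw (inverse CDF)."""
    u = rng.random() - 0.5                # uniform on [-0.5, 0.5)
    scale = 1.0 / math.sqrt(2.0)          # variance = 2 * scale^2 = 1
    magnitude = -scale * math.log(max(1.0 - 2.0 * abs(u), 1e-300))
    return math.copysign(magnitude, u)


def model_codebook(seed, sigma, size):
    """Generate `size` codevectors X_i = L * W_i from a seed and sigma."""
    rng = random.Random(seed)
    L = cholesky(sigma)
    k = len(sigma)
    codebook = []
    for _ in range(size):
        w = [laplacian_sample(rng) for _ in range(k)]
        codebook.append([sum(L[r][c] * w[c] for c in range(r + 1))
                         for r in range(k)])
    return codebook


sigma = [[4.0, 2.0], [2.0, 3.0]]          # hypothetical covariance matrix
encoder_codebook = model_codebook(seed=7, sigma=sigma, size=8)
decoder_codebook = model_codebook(seed=7, sigma=sigma, size=8)
print(encoder_codebook == decoder_codebook)  # True
```

Because the generator is deterministic for a given seed, the encoder and decoder obtain identical codebooks from the same header information, so no codebook needs to be stored or transmitted.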


3 Radiometric Resolution Gain of Lossy Compression Algorithms


In the TLLC approach, the radiometric resolution of the input image is explicitly reduced
by b bits by the truncation process. We show here that the MSE distortion resulting from
the truncation varies exponentially with b, the loss in radiometric resolution. The relation
between MSE distortion and loss of radiometric resolution can be derived as follows:

When b lower order bits are dropped, the error in a pixel may be any one of the integers 0, 1,
2, ..., $2^b - 1$. Assuming a uniform distribution of these pixel error values, the expected mean
squared error (MSE) is given by

$$\mathrm{MSE} = \frac{1}{2^b} \sum_{k=1}^{2^b - 1} k^2 \qquad (2)$$

$$= \left( 2 \cdot 2^{2b} - 3 \cdot 2^b + 1 \right) / 6 \qquad (3)$$

The uniform distribution assumption holds best for lower values of b. Equation (3) can be
derived from (2) using the Euler-Maclaurin summation formula [8]. From Equation (3), we
can obtain b in terms of MSE by solving the quadratic equation in $2^b$ and taking $\log_2$, giving:

$$b = \log_2 \left( \left[ 3 + \sqrt{48 \cdot \mathrm{MSE} + 1} \right] / 4 \right) \qquad (4)$$
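
Equations (2) to (4) can be checked numerically with the short Python sketch below; the function names are hypothetical. It compares the direct sum with the closed form and confirms that Equation (4) recovers b.

```python
import math


def truncation_mse_sum(b):
    """Equation (2): mean of k^2 over the 2^b equally likely errors."""
    levels = 2 ** b
    return sum(k * k for k in range(levels)) / levels


def truncation_mse_closed(b):
    """Equation (3): closed form of the same quantity."""
    return (2 * 2 ** (2 * b) - 3 * 2 ** b + 1) / 6


def bits_lost(mse):
    """Equation (4): loss of radiometric resolution for a given MSE."""
    return math.log2((3 + math.sqrt(48 * mse + 1)) / 4)


for b in range(1, 6):
    mse = truncation_mse_closed(b)
    print(b, truncation_mse_sum(b), mse, bits_lost(mse))
```

For b = 1 to 4 the closed form gives 0.5, 3.5, 17.5 and 77.5, which is close to the TLLC MSE values listed in Table 3.1 (0.5, 3.47, 17.4 and 77.3). This is consistent with those rows corresponding to b = 1 to 4, although the table itself does not state b.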


Equation (4) can be used to compute the loss of radiometric resolution due to the mean
squared error distortion for a given compression ratio. We can thus compare the performance
of lossy compression techniques in terms of radiometric efficiency. For a given compression
ratio, let the MSE distortions from two lossy methods (for example, TLLC and VQ) be $D_1$
and $D_2$, respectively. Let $b_1$ and $b_2$ be the losses of radiometric resolution from these methods,
computed from Equation (4). Now if $b_1 > b_2$, there is a gain in radiometric
resolution, $\Delta b$, by using VQ instead of TLLC, which is given by

$$\Delta b = b_1 - b_2 = \log_2 \frac{3 + \sqrt{48 D_1 + 1}}{3 + \sqrt{48 D_2 + 1}} \qquad (5)$$

For large distortions Equation (5) can be simplified to give

$$\Delta b \approx \log_2 \sqrt{D_1 / D_2} = \frac{1}{2} \log_2 \frac{D_1}{D_2} \qquad (6)$$
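
The following Python sketch evaluates the exact gain of Equation (5) and the large-distortion approximation of Equation (6) side by side. The function names and the distortion values 400 and 100 are hypothetical and are not measurements from our data sets.

```python
import math


def bits_lost(mse):
    """Equation (4): loss of radiometric resolution for a given MSE."""
    return math.log2((3 + math.sqrt(48 * mse + 1)) / 4)


def resolution_gain(d1, d2):
    """Equation (5): gain of the method with distortion d2 over the one with d1."""
    return bits_lost(d1) - bits_lost(d2)


def resolution_gain_approx(d1, d2):
    """Equation (6): large-distortion approximation, 0.5 * log2(d1 / d2)."""
    return 0.5 * math.log2(d1 / d2)


d_tllc, d_lossy = 400.0, 100.0   # hypothetical MSE values at the same CR
print(resolution_gain(d_tllc, d_lossy))
print(resolution_gain_approx(d_tllc, d_lossy))  # 0.5 * log2(4) = 1.0
```

In this example a fourfold reduction in MSE corresponds to about one bit of radiometric resolution, and the exact and approximate values differ by only a few hundredths of a bit.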


Using Equation (6), lossy compression techniques can be compared in terms of effective radiometric
gain: for a given rate, the technique with the lesser distortion has the higher effective
radiometric resolution. We report here the effective radiometric resolution gain of VQ, MVQ and
JPEG/DCT with respect to TLLC.


4 Experimental Results 


Three different multispectral image data sets are used in our experiments. The first
data set consists of spectral bands 1, 2, and 3 of a 2048-by-2048 pixel subimage of a Landsat
Thematic Mapper (TM) scene collected in 1991 (path/row 46/28) over the Gifford
Pinchot National Forest in the state of Washington in the United States of America. The
radiometric (pixel) resolution of this data is 8 bits. The second data set is the first two spectral
bands from a 409x2048 pixel Global Area Coverage (GAC) data set from the Advanced
Very High Resolution Radiometer (AVHRR) instrument, taken over the western Pacific
Ocean. The pixel resolution of this data is 12 bits (stored as 16 bits per pixel). The third data
set is made up of bands 22 and 23 from the Advanced Solid-state Array Spectroradiometer
(ASAS) instrument. This data set also has 12-bit pixel resolution. We used for our test a
512x420 pixel image designated 92161553 from Volume 4 of the FIFE CD-ROM series ([10]).

A training data set is required for the VQ method. This training data set should be 
disjoint from the test data set, but should be from the same instrument with the same 
spectral bands and should have similar scene characteristics. We chose to use the first 512 
columns of the TM data set for testing, and trained on columns 513 through 2048 (for all 
2048 lines). The AVHRR data was divided into two equal parts. The first 1024 lines were 
used for testing, while the second 1024 lines were used for training. As mentioned above, we 
used bands 22 and 23 of ASAS data set 92161553 of size 512x420 for testing. For training 
we used the same bands from the 512x590 pixel data set designated 92161621, the 512x600
pixel data set designated 92161631, and the 512x600 pixel data set designated 92161727. The training
data was used to generate codebooks for each instrument with vector sizes of 4, 8, 16 and
32 so that compressed data at four different compression ratios could be obtained.

The JPEG/DCT compression technique used here was implemented for 8-bit pixel resolution
images. To compress the 12-bit AVHRR and ASAS images using JPEG/DCT, the images
were first converted to 8-bit images by finding the brightest pixel value ($g_{max}$) and scaling down
all the pixels by the factor $g_{max}/255$. (MVQ can compress images with pixel resolutions of 8 to 16 bits,
and does not need any codebooks for compression.)
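
The scaling to 8 bits before JPEG/DCT compression, and the rescaling of the decoded pixels before the MSE is measured, can be sketched as follows. Rounding to the nearest integer and the handling of an all-zero image are assumptions of this sketch, and the function names and pixel values are hypothetical.

```python
def scale_to_8bit(pixels):
    """Scale pixels so that the brightest value g_max maps to 255."""
    g_max = max(pixels)
    if g_max == 0:
        return list(pixels), 1.0       # all-zero image: nothing to scale
    factor = g_max / 255
    return [round(p / factor) for p in pixels], factor


def rescale(decoded, factor):
    """Return decoded 8-bit pixels to the original radiometric scale."""
    return [round(p * factor) for p in decoded]


pixels_12bit = [0, 1600, 4080]         # hypothetical 12-bit pixel values
scaled, factor = scale_to_8bit(pixels_12bit)
print(scaled, factor)                  # [0, 100, 255] 16.0
print(rescale(scaled, factor))         # [0, 1600, 4080]
```

In general the scaling step alone is lossy, because several 12-bit values map to the same 8-bit value, so its error is included in the MSE reported for JPEG/DCT on the 12-bit data.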

The compression results on the TM data set are given in Table 1.1. The table provides
MSE distortions for different compression ratios using the four different compression methods
(TLLC, JPEG/DCT, VQ, and MVQ). The plots of CR vs. MSE for these
four techniques on the TM data set are shown in Figure 1. The gains in radiometric
resolution using JPEG/DCT, VQ and MVQ compared to TLLC are derived from the plots:
for three CRs, the MSEs are measured from the plots and $\Delta b$ is computed from Equation (6).
The radiometric resolution gain, $\Delta b$, for different CRs is given in Table 1.2 and the
plots are shown in Figure 2. The results on AVHRR data are given in Tables 2.1 and 2.2, and the
ASAS data results are given in Tables 3.1 and 3.2. The rate distortion curves for AVHRR
data and ASAS data using the three lossy compression techniques compared to TLLC are
shown in Figures 3 and 5, respectively. The gain in radiometric resolution obtained
by employing lossy compression techniques compared to TLLC is shown in Figure 4 for
AVHRR data and Figure 6 for ASAS data.


Table 1.1: CR Vs. MSE on TM data 

Table 3.1: CR vs. MSE on ASAS data

       TLLC              JPEG               VQ               MVQ
    CR      MSE       CR      MSE       CR      MSE       CR      MSE
   5.96     0.5      12.7     6.5       8.2    10.9       7.0    22.0
   8.36     3.47     22.3    16.8      15.8    23.4      12.8    81.5
  12.41    17.4      35.0    28.0      33.5    37.0      30.0   100.3
  19.48    77.3      53      50         -       -        40.3   200.1
  32.0    304         -       -         -       -         -       -


Table 3.2: $\Delta b$ w.r.t. TLLC for ASAS

            $\Delta b$ w.r.t. TLLC
    CR      JPEG      VQ      MVQ
   15.0     0.95     0.3      0.00
   20.0     1.30     0.8      0.00
   25.0     1.52     1.18     0.43
   30.0     1.64     1.38     0.65


5 Conclusions 


The data rates possible from remote sensing instruments can often far exceed the channel 
capacity for downlinking this data to Earth. The required data rate reduction is often 
obtained by reducing the resolution of the instrument. We have modeled the radiometric 
version of this approach by dropping a number of least significant bits and applying an 
appropriate lossless compression method. We refer to this technique as Truncation followed 
by Lossless Compression (TLLC). We have shown in our study that using lossy compression 
techniques such as JPEG, VQ and MVQ would give a gain in radiometric resolution compared 
to TLLC for a given data rate. In our experiments on Landsat TM data, we have found
radiometric resolution improvements of 1 to 1.5 bits for bit rates ranging from 0.8 to 0.5 bits/pixel,
or compression ratios of 10 to 20, with the VQ or JPEG techniques. Similar improvements are
obtained for AVHRR data using the VQ and JPEG techniques. However, for ASAS data, the
improvements are seen only for compression ratios exceeding 10 in the case of JPEG and
VQ, and exceeding 20 for MVQ.

Figure 1. Rate-distortion performance of lossy compression techniques and TLLC on the
TM data set (horizontal axis: compression ratio, with rate = 8/CR)

Figure 2. Radiometric resolution of lossy compression techniques on the TM data set

Figure 6. Radiometric resolution of lossy compression techniques on the ASAS data set